Performance of Epworth Sleepiness Scale and tiredness symptom used with simplified diagnostic tests for the identification of sleep apnea

Objective: To compare the performance of the Epworth Sleepiness Scale (ESS) and the tiredness symptom with the apnea-hypopnea index (AHI) in a population referred to home sleep tests. Materials and Methods: This correlational study assessed adult patients through questionnaires and respiratory polygraphy (RP). We estimated sensitivity (S), specificity (Sp), predictive values (PV), odds ratios (OR) from an adjusted multivariate model, and area under the ROC curve for each sex and severity category. Results: We analyzed 4424 patients, 2761 (62.4%) men and 1663 women, aged 53.6 years (42-65) with a BMI of 31.3 (25.5-36.1). 78.4% had AHI >5 events/hour. RP indicators were (men vs. women): AHI of 22.8±19.2 vs. 13.2±13.3 ev/h, ODI of 22.7±19.9 vs. 14.0±13.7 ev/h, and T <90%: 19.3±26.1 vs. 15.6±25.3. Men presented higher severity levels, more night-time hypoxemia, and more CPAP indications (52.2% vs. 29.2%; p<0.0001). ESS >10 was found in 25% of the population; ESS scores were 8±5.15 in men vs. 7.6±5.1 in women (p<0.001). 12% of men (as compared to 31.5% of women) with ESS >10 had a normal AHI. 72% of women reported tiredness (vs. 66.1% of men). The R² between ESS and AHI was 0.022 (CI95%: 0.111-0.185) p<0.0001 in men and 0.0019 (CI95%: -0.004 to 0.092) p>0.074 in women. Logistic regression showed the predictive capacity of ESS >10 for each AHI severity category (OR between 1.31 and 1.38 with p<0.05) and of tiredness for AHI >30 ev/h only in men (p<0.004). Conclusions: ESS >10 demonstrated a low screening performance, and only when present in male patients. Tiredness performed worse. Due to its limited value in the identification of sleep apnea patients, subjective somnolence should be considered in the context of an objective evaluation.


INTRODUCTION
Obstructive sleep apnea-hypopnea syndrome (OSA) has emerged as a public health concern due to its high prevalence in the general population and the associated high morbidity and mortality rates 1 .
OSA diagnostic criteria are based on an Apnea-Hypopnea Index (AHI) of >5 events per hour (ev/h) associated with excessive daytime sleepiness and/or cardiovascular or metabolic comorbidities. In middle-aged individuals, OSA prevalence 2,3 was estimated at 5-9%. Nevertheless, according to recent data 4 , OSA prevalence in the overall population ranges from 9% to 38%, lying close to 28% in Latin America 5 . Such findings call for pragmatic diagnostic strategies 6 .
Traditionally, OSA diagnosis is confirmed with polysomnography (PSG) though a duly validated respiratory polygraphy (RP) is also accepted in populations with a high/low clinical likelihood of suffering from OSA 7,8 .
Excessive daytime sleepiness is a relevant symptom because of its direct impact on patients' quality of life 9 and traffic accidents [10][11][12] , in addition to being a significant marker of poor sleep quality 13,14 . However, not all patients with sleep disorders report excessive daytime sleepiness, and individuals with high PSG AHI values often do not report this symptom at all 15 . On the other hand, excessive daytime sleepiness can also result from inappropriate sleep habits or psychiatric disorders, like depression 16 . In fact, lifestyle-related chronic sleep deprivation is one of the most frequent causes of daytime sleepiness 17 .
Current tools to assess excessive daytime sleepiness include self-administered subjective scales based on questionnaires validated against reference methods, and objective tests like the Multiple Sleep Latency Test (MSLT) and the Maintenance of Wakefulness Test. However, besides their limited availability in some contexts, the objective tests are time-consuming and must be conducted in a proper setting.
In routine clinical practice, referral centers continue to use subjective sleepiness scales (ESS, Stanford Sleepiness Scale, and Pediatric Daytime Sleepiness Scale) because their implementation is inexpensive and does not require complex training/equipment.
The subjective sleepiness scale known as Epworth Sleepiness Scale (ESS) 18 was first described by Murray W. Johns at Melbourne's Epworth Hospital in Victoria, Australia, in 1991. Soon after its publication, it was translated into Spanish and its use became frequent worldwide among populations suffering from OSA 19 . Several studies have assessed the advantages and disadvantages of ESS with significantly different results.
A systematic literature search by Sil and Barr found 5 studies that reported a significant correlation between ESS and AHI and 11 studies that showed no correlation. In addition to the fact that the statistical tools used in these studies were heterogeneous (Spearman's rank correlation coefficient, logistic regression, multivariate analysis, ROC analysis, etc.), OSA was defined by either PSG AHI (with inconsistent cut-off points) or the comparison of ESS score with the multiple sleep latency test results 20 .
There is little information available about the performance of ESS in OSA patients in Argentina. A validation study conducted in a small patient population using PSG 21 reported that ESS scores >10 had a high positive predictive value (PPV). However, in a previous study of ours, ESS showed modest performance in the identification of relevant OSA in 614 patients assessed through RP 22 .
In spite of these limitations, ESS scores >10-12 points are considered significant before a diagnostic test 8,[23][24][25] . Finding a correlation between ESS scores and RP AHI in the population referred to our unit for sleep tests would allow us to detect sleepiness and thus identify potential treatment candidates. The present analysis was implemented on a large patient sample referred to our center and assessed through home-based self-administered RP.

OBJECTIVE
To compare the performance of ESS score, tiredness symptom and AHI in the population referred to home sleep tests.

MATERIAL AND METHODS

Population
This is a retrospective correlational study based on a convenience sampling conducted at a Respiratory Medicine Center between January 2013 and December 2018 (6 years) in adult patients suspected of sleep disorders on the grounds of three cardinal symptoms: frequent snoring, excessive daytime sleepiness, or partner-observed apneas.
All study procedures involving human participants were conducted in accordance with the standards of the national and institutional research committee and the Declaration of Helsinki of 1964, as amended. The protocol was approved by the ethics and institutional review committee (protocol number: CRI#968).
Patients with more than one RP recording were included only once (Figure 1). Patients with daytime respiratory failure or heart failure, and those on mechanical ventilation or receiving supplemental oxygen, were excluded. Records obtained during hospital stays or in post-surgical settings were also excluded.

Respiratory Polygraphy
RP recordings were taken at night (1 night only) at patients' homes using a self-administered technique (i.e. the patient sets and starts the RP device before falling asleep). Patients received proper training on the use of the device at the hospital the morning before the test. The training session lasted 20 minutes and was delivered by nurses with experience in sleep medicine. Additionally, patients received an instruction manual with pictures and information on how to set the device. RP devices used in this study were Apnea Link Plus-Air (ResMed; Australia) and Alice Night One (Philips-Respironics; USA). All polygraphs recorded at least three basic signals: pulse oximetry, thoracic effort band, and nasal pressure cannula. Ancillary signals included body position, actigraphy, and snoring.
Tracings were included/excluded through manual editing under AASM standards 8 . Recordings with more than 240 minutes of valid total recording time (TRT) (>4 hours) met criteria for analysis. Apnea was defined as a drop of >80% in air flow and hypopnea as a reduction of >50% in air flow associated with a >3% drop in oxygen saturation, for more than 10 seconds in both cases. AHI was calculated as the number of respiratory events (apneas and hypopneas) per hour. All data were estimated on the basis of total recording time valid for analysis after manual editing by expert pulmonologists. AHI >5 ev/h was considered clinically significant. AHI severity categories used were mild (6-14.9 ev/h), moderate (15-29.9 ev/h), and severe (≥30 ev/h).
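The AHI definition and severity categories above can be sketched in a few lines of Python. This is an illustrative sketch, not the scoring software used in the study: the function names are hypothetical, and treating any AHI above 5 and below 15 ev/h as mild is an assumption made here to bridge the gap between the ">5 ev/h" threshold and the "6-14.9 ev/h" category.

```python
MIN_VALID_MINUTES = 240  # recordings need more than 240 valid minutes


def apnea_hypopnea_index(apneas: int, hypopneas: int, valid_minutes: float) -> float:
    """Return respiratory events per hour of valid recording time."""
    if valid_minutes <= MIN_VALID_MINUTES:
        raise ValueError("recording needs more than 240 minutes of valid time")
    return (apneas + hypopneas) / (valid_minutes / 60.0)


def ahi_severity(ahi: float) -> str:
    """Map an AHI value (ev/h) to the severity categories used in the text.

    Assumption: values above 5 and below 15 ev/h are all labeled 'mild'.
    """
    if ahi >= 30:
        return "severe"
    if ahi >= 15:
        return "moderate"
    if ahi > 5:
        return "mild"
    return "normal"


# Hypothetical recording: 40 apneas and 50 hypopneas over 360 valid minutes.
example_ahi = apnea_hypopnea_index(40, 50, 360)
example_category = ahi_severity(example_ahi)
```

In this hypothetical case, 90 events over 6 hours give 15 ev/h, which falls in the moderate category.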

Questionnaires
When patients picked up the RP devices at our center, age, sex and anthropometric variables (body mass index [BMI] in kg/m²) were systematically collected. Obesity was defined as BMI >30 kg/m². Daytime sleepiness was assessed with a validated Spanish translation of the current ESS version and OSA probability with Berlin and STOP-BANG (SBQ) questionnaires 15,26,27 . Tiredness (T) was assessed with the SBQ question specifically related to this symptom.
Each patient completed a printed copy of ESS before receiving the RP device 19 and had to choose one option for each item (feeling sleepy or falling asleep in specific situations). Each item is scored from 0 to 3. The final score goes from 0 (no probability of falling asleep in any described situation) to 24 (high probability of falling asleep in all 8 situations described).
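The ESS scoring rule described above (8 items, each scored 0 to 3, total 0 to 24, with >10 used as the cut-off in this study) can be sketched as follows. The function names and the example answers are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Sequence

ESS_ITEMS = 8      # number of situations rated
ESS_CUTOFF = 10    # scores above this value are treated as positive


def ess_total(item_scores: Sequence[int]) -> int:
    """Sum the 8 item scores after checking each is an integer from 0 to 3."""
    if len(item_scores) != ESS_ITEMS:
        raise ValueError("ESS requires exactly 8 item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each ESS item must be scored 0, 1, 2 or 3")
    return sum(item_scores)


def ess_positive(item_scores: Sequence[int]) -> bool:
    """Return True when the total exceeds the >10 cut-off."""
    return ess_total(item_scores) > ESS_CUTOFF


# Hypothetical answers from one patient (not study data).
answers = [2, 1, 1, 2, 1, 1, 2, 1]
total = ess_total(answers)
flagged = ess_positive(answers)
```

A total of exactly 10 is not flagged under this rule, since the cut-off is strictly greater than 10.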
Non-drivers or visually impaired patients who were not able to complete ESS were not included.

Statistical Analysis
A frequency histogram and a Kolmogorov-Smirnov test were used to assess the distribution of study variables. Quantitative variables were expressed as mean and standard deviation, and qualitative variables as absolute values and percentages.
Sensitivity (S), Specificity (Sp), Positive Predictive Value (PPV), Negative Predictive Value (NPV), and Odds Ratio (OR) were calculated.
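As a reminder of how these measures relate to one another, the sketch below derives them from a 2x2 table of test result (for example ESS >10) against reference result (for example AHI above a chosen threshold). The class name and the counts are hypothetical and are not taken from the study tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 table: test result versus reference standard."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    fn: int  # test negative, reference positive
    tn: int  # test negative, reference negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def odds_ratio(self) -> float:
        # Cross-product ratio; undefined when fp or fn is zero.
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts, for illustration only.
table = TwoByTwo(tp=60, fp=20, fn=140, tn=80)
```

With these made-up counts, sensitivity is 0.30, specificity 0.80, PPV 0.75, and the odds ratio about 1.71, which shows how a test can have a high PPV and a low sensitivity at the same time.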
To include variables correlating with ESS scores >10 and tiredness (T) for the identification of patients by AHI severity in a logistic regression model, we conducted a multivariate analysis and a Student's t test or χ2/Fisher test for quantitative or qualitative variables, respectively. Once predictor variables were obtained, we used a multivariate forward stepwise analysis. The dependent variables were the ESS value (dichotomous) and tiredness (present or absent); the independent variables were sex, BMI (> or <30 kg/m²), history of hypertension, depression diagnosis, age, and Berlin questionnaire (high or low risk), used to obtain the odds ratio (OR) and confidence interval (CI 95%) for AHI severity classification. Finally, ROC curves (AUC-ROC) were estimated and the Hosmer-Lemeshow goodness-of-fit test was applied. A p value <0.05 was considered statistically significant.
Analyses were performed with the commercial software packages SPSS 9.0 (SPSS Inc., Chicago, Illinois, USA) and Prism 7.04 (GraphPad, La Jolla, CA).
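For readers less familiar with the AUC-ROC, the following minimal sketch computes it as the probability that a randomly chosen patient with the condition has a higher score than a randomly chosen patient without it (ties count one half). It is written in plain Python for illustration and is not the SPSS or Prism procedure used in the study; the function names and the ESS totals are hypothetical.

```python
from typing import List, Sequence


def auc_roc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos_scores) * len(neg_scores))


def dichotomize(scores: Sequence[float], cutoff: float = 10) -> List[int]:
    """Turn raw scores into 1 (above cut-off) or 0 (at or below cut-off)."""
    return [1 if s > cutoff else 0 for s in scores]


# Hypothetical ESS totals, for illustration only (not study data).
ess_with_osa = [12, 9, 15, 7]
ess_without_osa = [5, 9, 11, 3]

auc_continuous = auc_roc(ess_with_osa, ess_without_osa)
auc_binary = auc_roc(dichotomize(ess_with_osa), dichotomize(ess_without_osa))
```

For a dichotomous test such as ESS >10, the AUC reduces to (S + Sp) / 2, so in this made-up example a sensitivity of 0.50 and a specificity of 0.75 give 0.625, lower than the 0.78 obtained from the raw scores.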

RESULTS
The study included 5743 patients. After the selection process (Figure 1), we analyzed data from a total of 4424 patients, 2761 (62.4%) men and 1663 women. Median age was 53.6 years (42-65) and BMI was 31.3 kg/m² (25.5-36.1). The most frequently reported symptoms were snoring in men and sleepiness in women (Figure 2).
Male patients presented higher levels of severity, more nighttime hypoxemia, and higher rates of CPAP therapy prescription (52.2% vs. 29.2%; p<0.0001).
We found ESS scores >10 in 25% of the population and a statistically significant difference between men and women (8±5.15 vs. 7.6±5.1), which increased with AHI severity in men (Figure 3).
12% of men vs. 31.5% of women with ESS scores >10 had a normal AHI. Tiredness was reported by 72% of women and 66.1% of men. Inconsistency between tiredness (T) and ESS scores >10 was present in 50% and 40.7% of women and men, respectively (Tables 2 and 3).
Correlation between ESS and AHI showed an R² of 0.022 (CI95%: 0.111-0.185) p<0.0001 in men and 0.0019 (CI95%: -0.004 to 0.092) p>0.074 in women (Figure 4). Table 4 presents S, Sp and AUC-ROC for each severity category based on AHI, showing that ESS scores >10 and Tiredness have a modest performance and a limited screening value in men.
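The reported confidence intervals are larger than the R² values themselves, which suggests they may refer to the correlation coefficient r rather than to R². The sketch below checks this reading under two assumptions that the source does not state: that r is a Pearson coefficient equal to the square root of the reported R², and that the interval was obtained with the Fisher z transformation. The function name is hypothetical.

```python
import math
from typing import Tuple


def fisher_ci(r: float, n: int, z_crit: float = 1.959964) -> Tuple[float, float]:
    """Approximate 95% CI for a Pearson r using the Fisher z transformation."""
    z = math.atanh(r)               # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)     # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r taken as sqrt(R^2); sample sizes are the numbers of men and women analyzed.
r_men = math.sqrt(0.022)
r_women = math.sqrt(0.0019)

ci_men = fisher_ci(r_men, 2761)
ci_women = fisher_ci(r_women, 1663)
```

Under these assumptions the intervals come out at roughly 0.11 to 0.18 for men and roughly -0.004 to 0.09 for women, close to the published figures, so the intervals are consistent with a weak correlation of about r = 0.15 in men and r = 0.04 in women.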
Adjusted logistic regression showed the predictive capacity of ESS scores >10 for each severity category (OR between 1.31 and 1.38) and of tiredness for AHI >30 ev/h exclusively in men (Table 5).

DISCUSSION
This study, carried out in a large sample, shows a low correlation between subjective somnolence symptoms and objective indicators obtained by the home sleep test, which points to the limited usefulness of subjective questions to identify sleep apnea patients or estimate their severity.
Several studies have assessed the relationship between ESS score and AHI in different populations [28][29][30][31][32] and cultures, which makes it difficult to compare or extrapolate results 20 . The use of different analytical methods in heterogeneous populations distracts our attention from a key question: Are the subjective symptoms of sleepiness good predictors of sleep apnea?
Patients suffering from sleep apnea may be unaware of the alterations in their breathing patterns during sleep, as clinical signs are often reported by their roommates or partners. Therefore, this population is hardly aware of the cardiovascular and/or metabolic risk they face.
This study was conducted on patients suspected of OSA referred to our unit to undergo a specific sleep test. One fourth of our sample had ESS scores >10, and OSA prevalence was high (>78%). It is worth considering that the traditional classification of severity is only an approximation 33 and that the use of diagnostic RP may result in the underestimation of indicators 8,27 .
The ESS questionnaire is routinely used to assess subjective sleepiness in patients with a clinical suspicion of OSA, though several scientific studies have consistently shown its low sensitivity 20 . In general, validations have been based on monitored PSG, rather than the now extensively used home-based RP 15,20,34,35 .
ESS is frequently used in our units and, therefore, we want to learn more about its performance in our population and its relationship with home-based RP results, another routine practice at our center.
The American Academy of Sleep Medicine 8 regards ESS scores >10 as 'significant daytime sleepiness'. In 2008, Rosenthal & Dolan 29 assessed the S and Sp of ESS for OSA diagnosis using PSG in a 268-patient population. They reported an S of 66% for a cut-off point of 10, with an area under the curve of 0.60. Thus, they demonstrated that ESS was inappropriate for OSA diagnosis, which is consistent with our results. However, our sex-differentiated analysis reveals that ESS performs better in men.
In a local study using PSG and a clinical questionnaire adjusted for age, BMI, and respiratory disturbance index, Nigro et al. 37 stated that women are less likely to report snoring and apnea. Likewise, they reported that female sex was an independent predictor of sleepiness, though ESS >10 was not a predictor of OSA, in line with our observations. OR analysis for different ESS cut-off points shows that male patients with ESS scores >10 are at a higher risk of high AHI. Logistic regression adjusted for confounders like obesity, neck circumference, sex, depression, Berlin questionnaire risk, hypertension, and age showed an OR be- [...] 20 . Very few studies have used logistic regression and they have also found a low degree of correlation between ESS and AHI 25,34 . A British study that assessed a total of 238 patients with PSG 20 obtained similar ROC curves with an area of 0.6, concluding that the usefulness of ESS is limited.
Osman et al. 35 state that ESS is not a good predictor of OSA based on its poor correlation with AHI and suggest daytime sleepiness can also exist in non-OSA snoring patients, which is consistent with our findings. However, it is worth noting that ESS scores >10 are associated with a higher probability of high AHIs in moderate to severe cases (candidates for CPAP therapy) with acceptable S (>75% in men and 91.5% in women) and high PPV (>70%) in a context where this disorder is highly prevalent. These data are also consistent with those obtained by PSG 34,35 .
As the literature recognizes, clinical correlation is of great value when simplified tests are used.
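The dependence of PPV on prevalence mentioned above follows from Bayes' theorem and can be sketched as below. The sensitivity of 0.75 echoes the order of magnitude quoted in the text, while the specificity of 0.40 and the low-prevalence setting of 10% are hypothetical values chosen only for illustration; the function name is also an assumption.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """PPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be proportions between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical test applied at two different prevalences.
ppv_referred = positive_predictive_value(0.75, 0.40, 0.78)   # high prevalence
ppv_low_prev = positive_predictive_value(0.75, 0.40, 0.10)   # low prevalence
```

With these illustrative inputs the PPV is about 0.82 at a prevalence of 78% and about 0.12 at a prevalence of 10%, which is why a high PPV in a referred population says little about how the same cut-off would perform in a general population.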
The use of daytime sleepiness and tiredness symptoms is widespread in different sleep units around the world, because it allows the clinician to check the correlation of sleep study results with self-reported symptoms. However, due to its low correlation with objective indicators, the usefulness of this strategy may only be verified in symptomatic patients.

Study Limitations
Our study was conducted on patients from one single center using retrospective analysis, with the limitations inherent to this type of study design. Geographic, social, and cultural factors make it difficult to extrapolate our results to other populations and health systems.
The identification of respiratory events in RP recordings could also be a limitation of our study because AHI was calculated by manual reading. Thus, such index is the result of the addition of the number of apneas/hypopneas per hour of recorded time, i.e. a quotient of the number of observed events and time of exposure.

CONCLUSIONS
ESS had a limited value in the identification of patients' severity and should be considered in the context of an objective evaluation together with AHI. In our experience, ESS scores > 10 have a limited discrimination capacity and are useful especially in male patients. Tiredness as a symptom performed even worse and was useful only when reported by male patients with severe OSA.